Skip to content

DOC: Extended the documentation for DataFrame.sort_values() #57678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 5, 2024

Conversation

aallahyar
Copy link
Contributor

The current docstring for DataFrame.sort_values() is lacking clarity, and also contains links that are not converted to hyperlinks. I extended and rephrased part of the documentation for this method. In particular:

  • Added further explanation to compare the single-column vs. multi-column sorting,
  • Extended explanation for customized sorting via key argument,
  • Provided a simplified version of natsort example, with added explanation.

…rther explain the single-column vs. multi-column sorting; added further explanation and simplification for customized sorting, e.g, using `natsort` package
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for improving this docstring @aallahyar. Your changes look great, but while reviewing I spotted an error in an example in the original docstring. If you want to fix it, that would be amazing.

Also, for next time, it's in general better to use a branch to open PRs. If you only contribute this PR it shouldn't make a difference, but if you open a second PR to pandas, things will become wild with git (changes here will be mixed with the new PR...).

@@ -6891,6 +6913,8 @@ def sort_values(
3 48hr 40
4 96hr 50
>>> from natsort import index_natsorted
>>> index_natsorted(df["time"])
[0, 3, 2, 4, 1]
>>> df.sort_values(
... by="time", key=lambda x: np.argsort(index_natsorted(df["time"]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example seems wrong. The df["time"] inside of the lambda should be the x received as the parameter. Do you mind updating it since you are improving this docstring?

@datapythonista datapythonista added Docs Sorting e.g. sort_index, sort_values labels Feb 29, 2024
@datapythonista
Copy link
Member

/preview

Copy link
Contributor

Website preview of this PR available at: https://pandas.pydata.org/preview/pandas-dev/pandas/57678/

@datapythonista
Copy link
Member

The rendered version of the changes here is available here, in case it's useful.

@aallahyar
Copy link
Contributor Author

Thanks @datapythonista for reviewing the update!

Would be happy to fix the issue. Please take a look.

Also thank you for the guidance!
It is indeed my first contribution to a large project, so I have a lot to learn 😄

@aallahyar
Copy link
Contributor Author

@datapythonista could you please let me know if any other modification is needed for this PR?

Thanks :)

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the work on this @aallahyar very nice. Looks perfect now (I'll let a second core dev have a look before merging).

@mroeschke mroeschke added this to the 3.0 milestone Mar 5, 2024
@mroeschke mroeschke merged commit 7988029 into pandas-dev:main Mar 5, 2024
@mroeschke
Copy link
Member

Thanks @aallahyar

pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…-dev#57678)

* DOC:extended the documentation for `pandas.DataFrame.sort_values`; further explain the single-column vs. multi-column sorting; added further explanation and simplification for customized sorting, e.g, using `natsort` package

* shortened the added dostrings to 80 columns; fixed a typo

* added another `shell` line to avoid `micromamba` test failure

* fixed a typo in a `DataFrame.sort_values()` example

* added a warning to raise awareness about a potential issue with `natsort`

* simplified the examples

* added a single example about `natsort`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Sorting e.g. sort_index, sort_values
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants